Lab 8. Image processing¶

Computational Methods for Geoscience - Fall 2023¶

Instructor: Eric Lindsey¶

Due: Oct. 31, 2023


In [1]:
# some useful imports and settings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from ipywidgets import interactive
import scipy.fft
import scipy.signal
import cv2

%config InlineBackend.figure_format = 'retina' # better looking figures on high-resolution screens
# automatically reload modules when running, otherwise jupyter does not notice if our functions have changed
%load_ext autoreload
%autoreload 2

# not used in this lab:

#import netCDF4 as nc
#import datetime
#import scipy.optimize
#import time
#import multiprocessing as mp

Assignment 1. Re-order the bands of an image and display it¶

a. Read in the landsat data from 10/16/2023 over Albuquerque, and plot the bands separately.

The bands represent the following information:

  • Band 2: Blue
  • Band 3: Green
  • Band 4: Red
  • Band 5: Near-infrared

b. Create an "RGB" composite to view the image in true color. Set the bounds of your plot to view just the area around Albuquerque and the Sandias.

c. Then try computing NDVI using the Red and Near-infrared bands and plot that image.
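Before plotting, it can help to convert the raw 16-bit digital numbers to surface reflectance. A minimal sketch, assuming the Landsat Collection 2 Level-2 scaling (scale factor 0.0000275, offset -0.2); the small arrays below are placeholders for the real band files:

```python
import numpy as np

def sr_to_reflectance(dn):
    """Convert Landsat Collection 2 Level-2 digital numbers to surface reflectance."""
    return dn.astype(np.float64) * 0.0000275 - 0.2

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, computed in floating point."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red)

# placeholder arrays standing in for the real Landsat bands
red_dn = np.array([[8000, 12000], [9000, 20000]], dtype=np.uint16)
nir_dn = np.array([[20000, 15000], [25000, 21000]], dtype=np.uint16)

refl = sr_to_reflectance(red_dn)
veg = ndvi(nir_dn, red_dn)
print(refl)
print(veg)
```

Converting to float before the NDVI subtraction matters: the bands are unsigned integers, so `near_infrared - red` would wrap around wherever red exceeds near-infrared.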

In [34]:
# read all four bands with IMREAD_UNCHANGED so each loads as a single-channel 16-bit array
# (the default flag would convert each TIF to an 8-bit, 3-channel image)
blue = cv2.imread('LC09_L2SP_033036_20231016_20231017_02_T1_SR_B2.TIF', cv2.IMREAD_UNCHANGED)
green = cv2.imread('LC09_L2SP_033036_20231016_20231017_02_T1_SR_B3.TIF', cv2.IMREAD_UNCHANGED)
red = cv2.imread('LC09_L2SP_033036_20231016_20231017_02_T1_SR_B4.TIF', cv2.IMREAD_UNCHANGED)
near_infrared = cv2.imread('LC09_L2SP_033036_20231016_20231017_02_T1_SR_B5.TIF', cv2.IMREAD_UNCHANGED)

# Note: my version of OpenCV reads these TIF files, but they display as a completely black
# image - both in my code and when I re-run the notes file - and none of the fixes I found
# online helped. The Landsat Level-2 bands are stored as 16-bit integers, so my best guess
# is that the values need to be rescaled before display; an OpenCV version difference or a
# corrupted download are other possibilities. I have written the code below as though the
# images load correctly.

fig, axes = plt.subplots(2,2,figsize=(10,10))
axes[0,0].imshow(blue)
axes[0,0].set_title('Blue')
axes[1,0].imshow(green)
axes[1,0].set_title('Green')
axes[0,1].imshow(red)
axes[0,1].set_title('Red')
axes[1,1].imshow(near_infrared)
axes[1,1].set_title('Near Infrared')

# stack the bands in R,G,B order for matplotlib, and scale to [0,1] for display
combined = np.dstack((red, green, blue)).astype(np.float64)
combined /= combined.max()
plt.imshow(combined)
plt.show()

# NDVI must be computed in floating point - the bands are unsigned integers,
# so subtracting them directly would wrap around
nir = near_infrared.astype(np.float64)
r = red.astype(np.float64)
ndvi = (nir - r) / (nir + r)
plt.imshow(ndvi, vmin=-1, vmax=1)
plt.colorbar()
plt.show()

Assignment 2. Filtering the image¶

Apply the following filters to the grayscale image of my cat Pumpkin, and plot them all together with the original

  • 9x9 smoothing
  • 9x9 median
  • Edge detection

For the last one, use the function cv2.Canny, a common edge-detection filter. You'll have to look up how to use it!

In [15]:
cat = cv2.imread('pumpkin.jpg')
cat = cv2.cvtColor(cat, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert so matplotlib shows true colors

cat_blur = cv2.blur(cat, (9, 9))      # 9x9 mean (smoothing) filter
cat_median = cv2.medianBlur(cat, 9)   # 9x9 median filter

cat_gray = cv2.cvtColor(cat, cv2.COLOR_RGB2GRAY)
cat_gray_blur = cv2.GaussianBlur(cat_gray, (3, 3), 0)
cat_edge = cv2.Canny(image=cat_gray_blur, threshold1=100, threshold2=200)

fig,axs = plt.subplots(2,2,figsize=(20,20))
axs[0,0].imshow(cat)
axs[0,1].imshow(cat_blur)
axs[1,0].imshow(cat_median)
axs[1,1].imshow(cat_edge, cmap='gray')
Out[15]:
<matplotlib.image.AxesImage at 0x27bbb64f490>

Image thresholding and connected components¶

Assignment 3: labeling the icebergs on the figure¶

The information output by the connected components identifier contains other useful information besides just the shape areas. For example, we can get the object centroids within our for loop, with

(cx, cy) = centroids[i]

and bounding-box information with

x = stats[i, cv2.CC_STAT_LEFT]
y = stats[i, cv2.CC_STAT_TOP]
w = stats[i, cv2.CC_STAT_WIDTH]
h = stats[i, cv2.CC_STAT_HEIGHT]

Copy the code above and modify it to plot the original grayscale image, but with a red text label on the figure at the centroid location of each iceberg displaying that iceberg's area if it's larger than your chosen threshold, and draw a bounding box around the largest one.

In [3]:
# load the (grayscale) image and make a plot
img = cv2.imread('iceye_icebergs.png')
img_gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
fig,axs = plt.subplots(1,2,figsize=(20,10))
axs[0].imshow(img_gray,cmap='gray')

# threshold the image at a grayscale value of 127 (halfway through the range).
# we also have to specify the "threshold type", which here is cv2.THRESH_BINARY
# finally, notice that there are 2 outputs of this function - the threshold value and the image.
# see a nice tutorial here: https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html
thres_value, img_binary = cv2.threshold(img_gray,127,255,cv2.THRESH_BINARY)
axs[1].imshow(img_binary,cmap='gray')
plt.show()

# now apply the segmentation
# based (in part) on a long tutorial here: 
# https://pyimagesearch.com/2021/02/22/opencv-connected-component-labeling-and-analysis/

# this value defines what is "connected" - it is either 4 (edges only) or 8 (diagonals OK)
connectivity = 4 

# get the connected components:
numLabels, labels, stats, centroids = cv2.connectedComponentsWithStats(img_binary, connectivity, cv2.CV_32S)

# how many did we find?
print('identified', numLabels, 'icebergs')

# loop over the components, and if their areas are above the required size, add them to the map.
minArea = 1000
mask=np.zeros(np.shape(img_binary))
count=0

cx_list = []
cy_list = []
area_label_list = []

for i in range(numLabels):
    area = stats[i,cv2.CC_STAT_AREA]
    (cx, cy) = centroids[i]
    # note, we skip label 0 because it is the background of the image (usually the largest component)
    if i>0 and area > minArea:
        #componentMask = (labels == i).astype("uint8") * 255
        #mask = cv2.bitwise_or(mask, componentMask)
        count += 1
        object_mask = (labels == i)
        mask[object_mask] = area
        cx_list.append(cx)
        cy_list.append(cy)
        area_label_list.append(area)
        
    # record the bounding box of the largest iceberg (identified here as area >= 10000)
    if i>0 and area >= 10000:
        x = stats[i, cv2.CC_STAT_LEFT]
        y = stats[i, cv2.CC_STAT_TOP]
        w = stats[i, cv2.CC_STAT_WIDTH]
        h = stats[i, cv2.CC_STAT_HEIGHT]
        
# make any areas that are still zero into nan, for a white background:
mask[mask==0] = np.nan

print(count, 'are larger than the min size of',minArea)
fig, ax = plt.subplots(1,1,figsize=(20,10))
ax.imshow(mask)
#fig.colorbar(cm.ScalarMappable(norm=norm, cmap=cmap), ax=ax)

rect = patches.Rectangle((x, y), w, h, linewidth=1, edgecolor='r', facecolor='none')
ax.add_patch(rect)

for i in range(len(cx_list)):
    ax.text(cx_list[i],cy_list[i],area_label_list[i],
             color="red",fontsize=15)
identified 2176 icebergs
7 are larger than the min size of 1000

Assignment 4: connected components of your own image!¶

Find an image you think could be easily thresholded and separated into connected components, then apply the above method to it. Be creative - for example, maybe you could use this method to identify and count balloons in the sky, or olivine crystals from a thin section, or even get the relative abundances of different minerals, if you can identify them separately using the different color bands.

Or maybe you can count penguins!

In [10]:
img = cv2.imread('balloons.jpg')
img_gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
fig,axs = plt.subplots(1,2,figsize=(20,10))
axs[0].imshow(img_gray,cmap='gray')

thres_value, img_binary = cv2.threshold(img_gray,135,255,cv2.THRESH_BINARY)
axs[1].imshow(img_binary,cmap='gray')
plt.show()

connectivity = 4 

# get the connected components:
numLabels, labels, stats, centroids = cv2.connectedComponentsWithStats(img_binary, connectivity, cv2.CV_32S)

# how many did we find?
print('identified', numLabels, 'connected components')

# loop over the components, and if their areas are above the required size, add them to the map.
minArea = 1000
mask=np.zeros(np.shape(img_binary))
count=0

cx_list = []
cy_list = []
area_label_list = []

for i in range(numLabels):
    area = stats[i,cv2.CC_STAT_AREA]
    (cx, cy) = centroids[i]
    # note, we skip label 0 because it is the background of the image (usually the largest component)
    if i>0 and area > minArea:
        #componentMask = (labels == i).astype("uint8") * 255
        #mask = cv2.bitwise_or(mask, componentMask)
        count += 1
        object_mask = (labels == i)
        mask[object_mask] = area
        
# make any areas that are still zero into nan, for a white background:
mask[mask==0] = np.nan

fig, ax = plt.subplots(1,1,figsize=(20,10))
ax.imshow(mask)
#fig.colorbar(cm.ScalarMappable(norm=norm, cmap=cmap), ax=ax)
identified 584 connected components
Out[10]:
<matplotlib.image.AxesImage at 0x2648f3d0490>
In [ ]: